perturbation analysis
- North America > United States (0.46)
- North America > Canada (0.04)
- Energy (0.68)
- Government > Regional Government (0.46)
Automatic Perturbation Analysis for Scalable Certified Robustness and Beyond Kaidi Xu
The majority of LiRP A-based methods focus on simple feed-forward networks and need particular manual derivations and implementations when extended to other architectures. In this paper, we develop an automatic framework to enable perturbation analysis on any neural network structures, by generalizing existing LiRP A algorithms such as CROWN to operate on general computational graphs.
- North America > United States (0.46)
- North America > Canada (0.04)
- Energy (0.68)
- Government > Regional Government (0.46)
Why Do Class-Dependent Evaluation Effects Occur with Time Series Feature Attributions? A Synthetic Data Investigation
Baer, Gregor, Grau, Isel, Zhang, Chao, Van Gorp, Pieter
Evaluating feature attribution methods represents a critical challenge in explainable AI (XAI), as researchers typically rely on perturbation-based metrics when ground truth is unavailable. However, recent work reveals that these evaluation metrics can show different performance across predicted classes within the same dataset. These "class-dependent evaluation effects" raise questions about whether perturbation analysis reliably measures attribution quality, with direct implications for XAI method development and evaluation trustworthiness. We investigate under which conditions these class-dependent effects arise by conducting controlled experiments with synthetic time series data where ground truth feature locations are known. We systematically vary feature types and class contrasts across binary classification tasks, then compare perturbation-based degradation scores with ground truth-based precision-recall metrics using multiple attribution methods. Our experiments demonstrate that class-dependent effects emerge with both evaluation approaches, even in simple scenarios with temporally localized features, triggered by basic variations in feature amplitude or temporal extent between classes. Most critically, we find that perturbation-based and ground truth metrics frequently yield contradictory assessments of attribution quality across classes, with weak correlations between evaluation approaches. These findings suggest that researchers should interpret perturbation-based metrics with care, as they may not always align with whether attributions correctly identify discriminating features. By showing this disconnect, our work points toward reconsidering what attribution evaluation actually measures and developing more rigorous evaluation methods that capture multiple dimensions of attribution quality.
- Europe > Netherlands > North Brabant > Eindhoven (0.05)
- Europe > Switzerland (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.86)
Perturbation Analysis of Singular Values in Concatenated Matrices
Concatenating matrices is a common technique for uncovering shared structures in data through singular value decomposition (SVD) and low-rank approximations. The fundamental question arises: How does the singular value spectrum of the concatenated matrix relate to the spectra of its individual components? In the present work, we develop a perturbation technique that extends classical results such as Weyl's inequality to concatenated matrices. We setup analytical bounds that quantify stability of singular values under small perturbations in submatrices. The results demonstrate that if submatrices are close in a norm, dominant singular values of the concatenated matrix remain stable enabling controlled trade-offs between accuracy and compression. These provide a theoretical basis for improved matrix clustering and compression strategies with applications in the numerical linear algebra, signal processing, and data-driven modeling.
- Europe > Ukraine > Kyiv Oblast > Kyiv (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Maryland (0.04)
- North America > United States > California > San Diego County > San Diego (0.04)
Statistical Hypothesis Testing for Auditing Robustness in Language Models
Rauba, Paulius, Wei, Qiyao, van der Schaar, Mihaela
Consider the problem of testing whether the outputs of a large language model (LLM) system change under an arbitrary intervention, such as an input perturbation or changing the model variant. We cannot simply compare two LLM outputs since they might differ due to the stochastic nature of the system, nor can we compare the entire output distribution due to computational intractability. While existing methods for analyzing text-based outputs exist, they focus on fundamentally different problems, such as measuring bias or fairness. To this end, we introduce distribution-based perturbation analysis, a framework that reformulates LLM perturbation analysis as a frequentist hypothesis testing problem. We construct empirical null and alternative output distributions within a low-dimensional semantic similarity space via Monte Carlo sampling, enabling tractable inference without restrictive distributional assumptions. The framework is (i) model-agnostic, (ii) supports the evaluation of arbitrary input perturbations on any black-box LLM, (iii) yields interpretable p-values; (iv) supports multiple perturbations via controlled error rates; and (v) provides scalar effect sizes. We demonstrate the usefulness of the framework across multiple case studies, showing how we can quantify response changes, measure true/false positive rates, and evaluate alignment with reference models. Above all, we see this as a reliable frequentist hypothesis testing framework for LLM auditing.
- North America > Canada (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Research Report > New Finding (0.91)
- Research Report > Experimental Study (0.68)
Reviews: Implicit Regularization of Discrete Gradient Dynamics in Linear Neural Networks
The paper addresses an important topic (implicit regularization in deep learning), is well-written, and although I did not verify proofs in the appendixes, I believe it is mathematically solid. Nonetheless, I have several concerns with regards to originality, significance and clarity (see below), which ultimately lead me to vote against its acceptance. This is true for the proof techniques as well. The only part I view as potentially novel is the perturbation analysis, but that is unfortunately not discussed at all in the body of the paper. I recommend to the authors to put much more focus on this aspect, as without it the paper is merely a straightforward extension of prior work.
Bounded-Regret MPC via Perturbation Analysis: Prediction Error, Constraints, and Nonlinearity
We study Model Predictive Control (MPC) and propose a general analysis pipeline to bound its dynamic regret. The pipeline first requires deriving a perturbation bound for a finite-time optimal control problem. Then, the perturbation bound is used to bound the per-step error of MPC, which leads to a bound on the dynamic regret. Thus, our pipeline reduces the study of MPC to the well-studied problem of perturbation analysis, enabling the derivation of regret bounds of MPC under a variety of settings. To demonstrate the power of our pipeline, we use it to generalize existing regret bounds on MPC in linear time-varying (LTV) systems to incorporate prediction errors on costs, dynamics, and disturbances.
Quantifying perturbation impacts for large language models
Rauba, Paulius, Wei, Qiyao, van der Schaar, Mihaela
We consider the problem of quantifying how an input perturbation impacts the outputs of large language models (LLMs), a fundamental task for model reliability and post-hoc interpretability. A key obstacle in this domain is disentangling the meaningful changes in model responses from the intrinsic stochasticity of LLM outputs. To overcome this, we introduce Distribution-Based Perturbation Analysis (DBPA), a framework that reformulates LLM perturbation analysis as a frequentist hypothesis testing problem. DBPA constructs empirical null and alternative output distributions within a low-dimensional semantic similarity space via Monte Carlo sampling. Comparisons of Monte Carlo estimates in the reduced dimensionality space enables tractable frequentist inference without relying on restrictive distributional assumptions. The framework is model-agnostic, supports the evaluation of arbitrary input perturbations on any black-box LLM, yields interpretable p-values, supports multiple perturbation testing via controlled error rates, and provides scalar effect sizes for any chosen similarity or distance metric. We demonstrate the effectiveness of DBPA in evaluating perturbation impacts, showing its versatility for perturbation analysis.
- Research Report > New Finding (0.91)
- Research Report > Experimental Study (0.70)
Benchmarking Transcriptomics Foundation Models for Perturbation Analysis : one PCA still rules them all
Bendidi, Ihab, Whitfield, Shawn, Kenyon-Dean, Kian, Yedder, Hanene Ben, Mesbahi, Yassir El, Noutahi, Emmanuel, Denton, Alisandra K.
Understanding the relationships among genes, compounds, and their interactions in living organisms remains limited due to technological constraints and the complexity of biological data. Deep learning has shown promise in exploring these relationships using various data types. However, transcriptomics, which provides detailed insights into cellular states, is still underused due to its high noise levels and limited data availability. Recent advancements in transcriptomics sequencing provide new opportunities to uncover valuable insights, especially with the rise of many new foundation models for transcriptomics, yet no benchmark has been made to robustly evaluate the effectiveness of these rising models for perturbation analysis. This article presents a novel biologically motivated evaluation framework and a hierarchy of perturbation analysis tasks for comparing the performance of pretrained foundation models to each other and to more classical techniques of learning from transcriptomics data. We compile diverse public datasets from different sequencing techniques and cell lines to assess models performance. Our approach identifies scVI and PCA to be far better suited models for understanding biological perturbations in comparison to existing foundation models, especially in their application in real-world scenarios.
- North America > Canada > Quebec > Montreal (0.14)
- North America > Canada > Ontario > Toronto (0.14)
- Asia > Middle East > Republic of Türkiye > Corum Province > Corum (0.04)
- (3 more...)
- Research Report > Experimental Study (0.46)
- Research Report > New Finding (0.45)
Regret Bounds for Episodic Risk-Sensitive Linear Quadratic Regulator
Xu, Wenhao, Gao, Xuefeng, He, Xuedong
Risk-sensitive linear quadratic regulator is one of the most fundamental problems in risk-sensitive optimal control. In this paper, we study online adaptive control of risk-sensitive linear quadratic regulator in the finite horizon episodic setting. We propose a simple least-squares greedy algorithm and show that it achieves $\widetilde{\mathcal{O}}(\log N)$ regret under a specific identifiability assumption, where $N$ is the total number of episodes. If the identifiability assumption is not satisfied, we propose incorporating exploration noise into the least-squares-based algorithm, resulting in an algorithm with $\widetilde{\mathcal{O}}(\sqrt{N})$ regret. To our best knowledge, this is the first set of regret bounds for episodic risk-sensitive linear quadratic regulator. Our proof relies on perturbation analysis of less-standard Riccati equations for risk-sensitive linear quadratic control, and a delicate analysis of the loss in the risk-sensitive performance criterion due to applying the suboptimal controller in the online learning process.
- Asia > China > Hong Kong (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Asia > Middle East > Jordan (0.04)